20 research outputs found

    If You Can't Kill a Supermutant, You Have a Problem

    Get PDF
    Quality of software test suites can be effectively and accurately measured using mutation analysis. Traditional mutation involves seeding first and sometimes higher order faults into the program, and evaluating each for detection.However, traditional mutants are often heavily redundant, and it is desirable to produce the complete matrix of test cases vs mutants detected by each. Unfortunately, even the traditional mutation analysis has a heavy computational footprint due to the requirement of independent evaluation of each mutant by the complete test suite, and consequently the cost of evaluation of complete kill matrix is exorbitant.We present a novel approach of combinatorial evaluation of multiple mutants at the same time that can generate the complete mutant kill matrix with lower computational requirements.Our approach also has the potential to reduce the cost of execution of traditional mutation analysis especially for test suites with weak oracles such as machine-generated test suites, while at the same time liable to only a linear increase in the time taken for mutation analysis in the worst case

    How to Compare Fuzzers

    Full text link
    Fuzzing is a key method to discover vulnerabilities in programs. Despite considerable progress in this area in the past years, measuring and comparing the effectiveness of fuzzers is still an open research question. In software testing, the gold standard for evaluating test quality is mutation analysis, assessing the ability of a test to detect synthetic bugs; if a set of tests fails to detect such mutations, it will also fail to detect real bugs. Mutation analysis subsumes various coverage measures and provides a large and diverse set of faults that can be arbitrarily hard to trigger and detect, thus preventing the problems of saturation and overfitting. Unfortunately, the cost of traditional mutation analysis is exorbitant for fuzzing, as mutations need independent evaluation. In this paper, we apply modern mutation analysis techniques that pool multiple mutations; allowing us, for the first time, to evaluate and compare fuzzers with mutation analysis. We introduce an evaluation bench for fuzzers and apply it to a number of popular fuzzers and subjects. In a comprehensive evaluation, we show how it allows us to assess fuzzer performance and measure the impact of improved techniques. While we find that today's fuzzers can detect only a small percentage of mutations, this should be seen as a challenge for future research -- notably in improving (1) detecting failures beyond generic crashes (2) triggering mutations (and thus faults).Comment: 13 pages, 4 figure

    Detecting Information Flow by Mutating Input Data

    Get PDF
    Analyzing information flow is central in assessing the security of applications. However, static and dynamic analyses of information flow are easily challenged by non-available or obscure code. We present a lightweight mutation-based analysis that systematically mutates dynamic values returned by sensitive sources to assess whether the mutation changes the values passed to sensitive sinks. If so, we found a flow between source and sink. In contrast to existing techniques, mutation-based flow analysis does not attempt to identify the specific path of the flow and is thus resilient to obfuscation. In its evaluation, our MUTAFLOW prototype for Android programs showed that mutation-based flow analysis is a lightweight yet effective complement to existing tools. Compared to the popular FLOWDROID static analysis tool, MUTAFLOW requires less than 10% of source code lines but has similar accuracy; on 20 tested real-world apps, it is able to detect 75 flows that FLOWDROID misses

    Comparative validation of machine learning algorithms for surgical workflow and skill analysis with the HeiChole benchmark

    Get PDF
    Purpose: Surgical workflow and skill analysis are key technologies for the next generation of cognitive surgical assistance systems. These systems could increase the safety of the operation through context-sensitive warnings and semi-autonomous robotic assistance or improve training of surgeons via data-driven feedback. In surgical workflow analysis up to 91% average precision has been reported for phase recognition on an open data single-center video dataset. In this work we investigated the generalizability of phase recognition algorithms in a multicenter setting including more difficult recognition tasks such as surgical action and surgical skill. Methods: To achieve this goal, a dataset with 33 laparoscopic cholecystectomy videos from three surgical centers with a total operation time of 22 h was created. Labels included framewise annotation of seven surgical phases with 250 phase transitions, 5514 occurences of four surgical actions, 6980 occurences of 21 surgical instruments from seven instrument categories and 495 skill classifications in five skill dimensions. The dataset was used in the 2019 international Endoscopic Vision challenge, sub-challenge for surgical workflow and skill analysis. Here, 12 research teams trained and submitted their machine learning algorithms for recognition of phase, action, instrument and/or skill assessment. Results: F1-scores were achieved for phase recognition between 23.9% and 67.7% (n = 9 teams), for instrument presence detection between 38.5% and 63.8% (n = 8 teams), but for action recognition only between 21.8% and 23.3% (n = 5 teams). The average absolute error for skill assessment was 0.78 (n = 1 team). Conclusion: Surgical workflow and skill analysis are promising technologies to support the surgical team, but there is still room for improvement, as shown by our comparison of machine learning algorithms. This novel HeiChole benchmark can be used for comparable evaluation and validation of future work. In future studies, it is of utmost importance to create more open, high-quality datasets in order to allow the development of artificial intelligence and cognitive robotics in surgery

    Learning Input Tokens for Effective Fuzzing

    Get PDF
    Modern fuzzing tools like AFL operate at a lexical level: They explore the input space of tested programs one byte after another. For inputs with complex syntactical properties, this is very inefficient, as keywords and other tokens have to be composed one character at a time. Fuzzers thus allow to specify dictionaries listing possible tokens the input can be composed from; such dictionaries speed up fuzzers dramatically. Also, fuzzers make use of dynamic tainting to track input tokens and infer values that are expected in the input validation phase. Unfortunately, such tokens are usually implicitly converted to program specific values which causes a loss of the taints attached to the input data in the lexical phase. In this paper we present a technique to extend dynamic tainting to not only track explicit data flows but also taint implicitly converted data without suffering from taint explosion. This extension makes it possible to augment existing techniques and automatically infer a set of tokens and seed inputs for the input language of a program given nothing but the source code. Specifically targeting the lexical analysis of an input processor, our lFuzzer test generator systematically explores branches of the lexical analysis, producing a set of tokens that fully cover all decisions seen. The resulting set of tokens can be directly used as a dictionary for fuzzing. Along with the token extraction seed inputs are generated which give further fuzzing processes a head start. In our experiments, the lFuzzer-AFL combination achieves up to 17% more coverage on complex input formats like JSON, LISP, tinyC, and JavaScript compared to AFL

    Temporal Behavioral Parameters of On-Going Gaze Encounters in a Virtual Environment

    No full text
    To navigate the social world, humans heavily rely on gaze for non-verbal communication as it conveys information in a highly dynamic and complex, yet concise manner: For instance, humans utilize gaze effortlessly to direct and infer the attention of a possible interaction partner. Many traditional paradigms in social gaze research though rely on static ways of assessing gaze interaction, e.g., by using images or prerecorded videos as stimulus material. Emerging gaze contingent paradigms, in which algorithmically controlled virtual characters can respond flexibly to the gaze behavior of humans, provide high ecological validity. Ideally, these are based on models of human behavior which allow for precise, parameterized characterization of behavior, and should include variable interactive settings and different communicative states of the interacting agents. The present study provides a complete definition and empirical description of a behavioral parameter space of human gaze behavior in extended gaze encounters. To this end, we (i) modeled a shared 2D virtual environment on a computer screen in which a human could interact via gaze with an agent and simultaneously presented objects to create instances of joint attention and (ii) determined quantitatively the free model parameters (temporal and probabilistic) of behavior within this environment to provide a first complete, detailed description of the behavioral parameter space governing joint attention. This knowledge is essential to enable the modeling of interacting agents with a high degree of ecological validity, be it for cognitive studies or applications in human-robot interaction

    Rosiglitazone treatment enhances intracellular actin dynamics and glucose transport in hypertrophic adipocytes

    No full text
    Aims: To accommodate surplus energy, adipose tissue expands by increasing both adipose cell size (hypertrophy) and cell number (hyperplasia). Enlarged, hypertrophic adipocytes are known to have reduced insulin response and impaired glucose transport, which negatively influence whole-body glucose homeostasis. Rosiglitazone is a peroxisome proliferator-activated receptor gamma (PPARγ) agonist, known to stimulate hyperplasia and to efficiently improve insulin sensitivity. Still, a limited amount of research has investigated the effects of rosiglitazone in mature, hypertrophic adipocytes. Therefore, the objective of this study was to examine rosiglitazone's effect on insulin-stimulated glucose uptake in hypertrophic adipocytes. Main methods: C57BL/6J male mice were subjected to 2 weeks of high-fat diet (HFD) followed by 1 week of HFD combined with daily administration of rosiglitazone (10 mg/kg). Adipose cell-size distribution and gene expression were analysed in intact adipose tissue, and glucose uptake, insulin response, and protein expression were examined using primary adipocytes isolated from epididymal and inguinal adipose tissue. Key findings: HFD-feeding induced an accumulation of hypertrophic adipocytes, which was not affected by rosiglitazone-treatment. Still, rosiglitazone efficiently improved insulin-stimulated glucose transport without restoring insulin signaling or GLUT4 expression in similar-sized adipocytes. This improvement occurred concurrently with extracellular matrix remodelling and restored intracellular levels of targets involved in actin turnover. Significance: These results demonstrate that rosiglitazone improves glucose transport in hypertrophic adipocytes, and highlights the importance of the cytoskeleton and extracellular matrix as potential therapeutic targets
    corecore